Abstract. We study the level spacing distribution p(s) in the spectrum of random
networks. According to our numerical results, the shape of p(s) in the Erdos-Renyi
(E-R) random graph is determined by the average degree $\langle k \rangle$, and p(s) undergoes
a dramatic change when $\langle k \rangle$ is varied around the critical point of the percolation
transition, $\langle k \rangle = 1$. When $\langle k \rangle \gg 1$, p(s) is described by the statistics of
the Gaussian Orthogonal Ensemble (GOE), one of the major statistical ensembles
in Random Matrix Theory, whereas at $\langle k \rangle = 1$ it follows the Poisson level spacing
distribution. Closely above the critical point, p(s) can be described in terms of
an intermediate distribution between Poisson and the GOE, the Brody-distribution.
Furthermore, below the critical point p(s) can be given with the help of the regularised
Gamma-function. Motivated by these results, we analyse the behaviour of p(s) in
real networks such as the Internet, a word association network and a protein-protein
interaction network as well. When the giant component of these networks is destroyed
in a node deletion process simulating the networks subjected to intentional attack, their
level spacing distribution undergoes a similar transition to that of the E-R graph.

1. Introduction

A wide class of complex systems occurring from the level of cells to society can be 
described in terms of networks capturing the intricate web of connections among the 
units they are made of. Whenever many similar objects in mutual interactions are 
encountered, these objects can be represented as nodes and the interactions as links 
between the nodes, defining a network. The World Wide Web, the science citation index,
and biochemical reaction pathways in living cells are all good examples of complex 
systems widely modeled with networks, and the set of further phenomena where the 
network approach can be used is even more diverse. Graphs corresponding to such 
real networks exhibit unexpected non-trivial properties, e.g., new kinds of degree 
distributions, anomalous diameter, spreading phenomena, clustering coefficient, and 
correlations [1, 2, 3, 4, 5]. In most cases, the overall structure of networks reflects the
characteristic properties of the original systems, and enables one to sort seemingly very
different systems into a few major classes of stochastic graphs [2111]. These developments 
have greatly advanced the potential to interpret the fundamental common features of 
such diverse systems as social groups, technological, biological and other networks. 

Another general approach to the analysis of complex systems is provided by Random 
Matrix Theory (RMT), originally proposed by Wigner and Dyson in the 1950s and 1960s for the study of
the spectrum of nuclei [6]. Since then, RMT has been successfully used in investigations 
ranging from the studies of phase transitions in disordered systems [7], through the
spectral analysis of chaotic systems [8] and the stock market [9] to the studies of
brain responses [10]. Recently, the network approach to complex systems and
RMT were combined in the analysis of the modular structure of biological networks
[11]. Network modules, also called communities, cohesive groups, clusters, etc.,
correspond to structural sub-units, associated with more highly interconnected parts, 
with no unique definition [El[E3l[El[E5l[EIH[E3[^ Such 
building blocks (functionally related proteins [261 121] , industrial sectors [28] , groups of 
people [HI 29J, cooperative players [HO, [31], etc.) can play a crucial role in forming 
the structural and functional properties of the involved networks, therefore there has 
been a quickly growing interest in the last few years in developing efficient methods for 
locating these modules. One of the best-known community-finding algorithms today
is the Girvan-Newman algorithm [131 EH], which is based on the recursive deletion of links
with the highest betweenness. This process splits the network into smaller
parts, corresponding to the communities, and the deletion of links is stopped when
optimal modularity is reached.

In the analysis of a protein-protein interaction network and a metabolic network, 
Luo et al. found that the fluctuations of the level spacing in the spectrum obey different 
statistics when the networks are split into the communities given by the Girvan-Newman
algorithm compared to the original state [11]. (The spectrum of a network is given
by the eigenvalues of its adjacency matrix [32, 33, 34].) For both networks, in the
original state the fluctuations of the level spacing followed the statistics of the Gaussian
Orthogonal Ensemble (GOE), one of the major statistical ensembles in RMT. However,
when the networks were split into communities, the fluctuations in the level spacing
became Poissonian, which is another important statistic in RMT. Based on this effect,
Luo et al. proposed that monitoring such changes in the spectral properties can
help identify network modules.

Motivated by these very interesting results, here we study the level spacing
fluctuations in the spectrum of networks in a more general framework. Our
investigations of the Erdos-Renyi (E-R) random graph, the Internet, a word association 
graph and a protein-protein interaction graph show that similar spectral transitions 
occur in these networks as well. However, our results indicate that such transitions in 
the spectrum are more likely to be connected to the appearance of a giant component 
than to the ideal partitioning of the network, since e.g. in the E-R graph communities are 
totally absent. The paper is organised as follows: first we summarise the most important 
properties of the level spacing distribution in RMT, then describe our results for the 
spectral transitions in the E-R graph. Finally, we show that similar spectral transitions 
can be induced in real networks as well, simply by destroying the giant component, 
without invoking any sophisticated partitioning of the network to communities. 

2. The level spacing distribution 

The main object of study in RMT is the set of eigenvalues $\{e_i\}$ of the random matrix
representing the system under investigation. In the case of networks, this matrix corresponds
to the adjacency matrix, in which the entry $A_{ij} = 1$ if the nodes i and j are linked,
otherwise $A_{ij} = 0$. (For simplicity, let us neglect the possible directionality and weight
of the links.) One of the most important results of RMT is that complex systems can
be sorted into a few universal classes based on the behaviour of the fluctuations in
the level spacing between these eigenvalues. The level spacing between two adjacent
eigenvalues is simply $S_i = e_{i+1} - e_i$; however, the distribution of this quantity cannot be
universal, as there are systems in which eigenvalues are more dense/sparse on average
compared to others. Therefore, the unfolded level spacings are studied instead,
which can be defined as

$$ s_i = \frac{S_i}{\langle S \rangle_i}, \tag{1} $$

where $\langle S \rangle_i$ denotes the local average of the level spacing in the vicinity of $e_i$. The
probability distribution of the unfolded level spacings (which from now on we shall
simply call the level spacing distribution) can be described with the probability density
p(s) and the corresponding cumulative distribution $P(s) = \int_0^s p(x)\,dx$. Due to the
unfolding (1), the expectation value of the level spacing is one:

$$ \langle s \rangle = \int_0^{\infty} s\, p(s)\, ds = 1. \tag{2} $$
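
As an illustration of the unfolding described above, the following minimal Python sketch divides each raw spacing by a local average taken over a sliding window. The function name `unfolded_spacings` and the window parameter `half_window` are assumptions made for this example; the text does not specify how the local average was evaluated, so this is one possible choice rather than the procedure used for the results.

```python
def unfolded_spacings(eigenvalues, half_window=5):
    """Return unfolded spacings s_i = S_i / <S>_i.

    <S>_i is approximated (an assumption of this sketch) by the mean of
    the raw spacings within `half_window` positions of spacing i.
    """
    levels = sorted(eigenvalues)
    # Raw spacings S_i = e_{i+1} - e_i between adjacent eigenvalues.
    raw = [b - a for a, b in zip(levels, levels[1:])]
    unfolded = []
    for i, spacing in enumerate(raw):
        lo = max(0, i - half_window)
        hi = min(len(raw), i + half_window + 1)
        local_mean = sum(raw[lo:hi]) / (hi - lo)
        # Skip windows made up entirely of degenerate (zero) spacings.
        if local_mean > 0:
            unfolded.append(spacing / local_mean)
    return unfolded
```

For equally spaced levels every unfolded spacing equals one, whatever the raw spacing is, which is the density independence the unfolding is meant to provide.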



The level spacing distribution of systems with strongly correlated eigenvalues follows the
statistics of the GOE, defined as an ensemble of random matrices filled with elements
drawn from a Gaussian distribution. In this case p(s) and P(s) are given by the Wigner-Dyson
distribution [35] as

$$ p_{\mathrm{GOE}}(s) = \frac{\pi}{2}\, s \exp\left(-\frac{\pi}{4} s^2\right), \tag{3a} $$

$$ P_{\mathrm{GOE}}(s) = 1 - \exp\left(-\frac{\pi}{4} s^2\right). \tag{3b} $$

Another important universality class is formed by the systems with no correlation 
between the eigenvalues, following a Poisson level spacing distribution: 

$$ p_{\mathrm{P}}(s) = \exp(-s), \tag{4a} $$

$$ P_{\mathrm{P}}(s) = 1 - \exp(-s). \tag{4b} $$

In chaotic systems with weak disorder, intermediate statistics were observed as well, 
described by the Brody-distribution [361 EH M, US HQ] : 

$$ p_{\mathrm{B}}(s) = C \alpha s^{\alpha - 1} \exp(-C s^{\alpha}), \tag{5a} $$

$$ P_{\mathrm{B}}(s) = 1 - \exp(-C s^{\alpha}), \tag{5b} $$

where C is a normalising constant ensuring that Eq. (2) is fulfilled, and the parameter $\alpha$
determines how far the distribution falls from the two limiting cases. (At $\alpha = 1$ we
recover the Poisson-distribution, whereas $\alpha = 2$ corresponds to the statistics of the
GOE.) In the next section we shall analyse the level spacing distribution of the E-R
graph.
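
The two limiting cases of the Brody-distribution can be checked numerically with a short Python sketch. The closed form C = Γ(1 + 1/α)^α used below is not stated in the text; it follows from requiring unit mean spacing for a cumulative distribution of the form 1 - exp(-C s^α), and the function names are illustrative.

```python
import math

def brody_constant(alpha):
    """C = Gamma(1 + 1/alpha)**alpha, which gives a mean spacing of one."""
    return math.gamma(1.0 + 1.0 / alpha) ** alpha

def brody_pdf(s, alpha):
    """Brody probability density for spacing s and exponent alpha."""
    c = brody_constant(alpha)
    return c * alpha * s ** (alpha - 1.0) * math.exp(-c * s ** alpha)

def brody_cdf(s, alpha):
    """Brody cumulative distribution for spacing s and exponent alpha."""
    c = brody_constant(alpha)
    return 1.0 - math.exp(-c * s ** alpha)

def poisson_cdf(s):
    """Cumulative Poisson level spacing distribution."""
    return 1.0 - math.exp(-s)

def goe_cdf(s):
    """Cumulative Wigner-Dyson (GOE) level spacing distribution."""
    return 1.0 - math.exp(-math.pi * s * s / 4.0)
```

With this choice of C, `brody_cdf(s, 1.0)` agrees with `poisson_cdf(s)` and `brody_cdf(s, 2.0)` agrees with `goe_cdf(s)`, since Γ(2) = 1 and Γ(3/2)² = π/4.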



3. Spectral transition in the E-R graph 

The concept of random graphs was introduced by Erdos and Renyi [H] in the 1950s in a 
simple model starting with N nodes, and connecting every pair of nodes independently 
with the same probability p. Even though real networks differ from this simple model in 
many aspects, the E-R uncorrelated random graph still remains of great interest, since
such a graph can serve both as a test bed for checking all sorts of new ideas concerning 
complex networks in general, and as a prototype of random graphs to which all other 
random graphs can be compared. 

Perhaps the most conspicuous early result on the E-R graphs was related to the 
percolation transition taking place at p = 1/N. The appearance of a giant component in
a network, which is also referred to as the percolating component, results in a dramatic
change in the overall topological features of the graph and has been at the centre of
interest for other networks as well. The relative size of the largest component compared
to the total number of nodes is determined by the average degree $\langle k \rangle = pN$, and the
critical point of the transition is at $\langle k \rangle = 1$.
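
The percolation transition can be reproduced with a small simulation. The sketch below is a minimal illustration, assuming the convention p = ⟨k⟩/N from the text; the function names and the use of plain adjacency lists are choices made for this example only.

```python
import random

def er_graph(n, mean_degree, seed=None):
    """Adjacency lists of an E-R graph with link probability p = mean_degree / n."""
    rng = random.Random(seed)
    p = mean_degree / n
    adj = [[] for _ in range(n)]
    # Connect every pair of nodes independently with probability p.
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].append(j)
                adj[j].append(i)
    return adj

def largest_component_fraction(adj):
    """Relative size of the largest connected component."""
    n = len(adj)
    seen = [False] * n
    best = 0
    for start in range(n):
        if seen[start]:
            continue
        # Depth-first traversal of the component containing `start`.
        seen[start] = True
        stack = [start]
        size = 0
        while stack:
            node = stack.pop()
            size += 1
            for nb in adj[node]:
                if not seen[nb]:
                    seen[nb] = True
                    stack.append(nb)
        best = max(best, size)
    return best / n
```

Calling `largest_component_fraction(er_graph(1000, k, seed=1))` for values of `k` on both sides of one shows the largest component growing from a negligible fraction of the nodes to the majority of them.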

In our studies concerning the level spacing distribution of the E-R graph, we 
observed a similar phenomenon: the shape of p(s) is determined by $\langle k \rangle$, or in other
words, the p(s) of E-R graphs with the same average degree follow the same curve.
In Fig. 1 we demonstrate this effect by plotting the level spacing distribution for E-R
graphs of size N = 5000 (circles), N = 7000 (squares) and N = 10000 (triangles),
with average degree $\langle k \rangle = 0.5$ (white symbols), $\langle k \rangle = 1$ (gray symbols), and $\langle k \rangle = 1.5$
(black symbols). (For each parameter setting, the spectrum of several different instances
of E-R graphs with the given N and $\langle k \rangle$ was evaluated numerically, and the resulting
level spacing distributions were averaged.) Besides the data collapse for the different
values of N, it can be seen that the level spacing distribution undergoes a dramatic
change when $\langle k \rangle$ is varied around $\langle k \rangle = 1$. The p(s) at $\langle k \rangle = 1$, the critical point of
the percolation transition (denoted by gray symbols), is exponential, whereas it takes
somewhat more complex forms for both $\langle k \rangle < 1$ and $\langle k \rangle > 1$.

Figure 1. The level spacing distribution p(s) of E-R graphs of size N = 5000
(circles), N = 7000 (squares), and N = 10000 (triangles) at average degree $\langle k \rangle = 0.5$
(white symbols), $\langle k \rangle = 1$ (gray symbols), and $\langle k \rangle = 1.5$ (black symbols). The curves
corresponding to different system sizes with the same average degree coincide with each other.



First, let us concentrate on the $\langle k \rangle < 1$ regime. This corresponds to the dispersed
state, where the graph consists of small isolated subgraphs. In Fig. 2 we plot the
observed cumulative level spacing distribution for $\langle k \rangle = 0.4$, 0.6, 0.8 and $\langle k \rangle = 1$. In
each case, the empirical results can be very well fitted by

$$ p(s) = \frac{\alpha^{\alpha}}{\Gamma(\alpha)}\, s^{\alpha - 1} \exp(-\alpha s), \tag{6a} $$

$$ P(s) = \frac{\gamma(\alpha, \alpha s)}{\Gamma(\alpha)} = P(\alpha, \alpha s), \tag{6b} $$

where $\alpha \in [1, 2]$ is the fitting parameter, and $\Gamma(\alpha)$, $\gamma(\alpha, s)$ and $P(\alpha, s)$ denote the
Gamma-function, the incomplete Gamma-function and the regularised Gamma-function
respectively, defined as

$$ \Gamma(\alpha) = \int_0^{\infty} t^{\alpha - 1} \exp(-t)\, dt, \tag{7a} $$

$$ \gamma(\alpha, s) = \int_0^{s} t^{\alpha - 1} \exp(-t)\, dt, \tag{7b} $$

$$ P(\alpha, s) = \frac{\gamma(\alpha, s)}{\Gamma(\alpha)}. \tag{7c} $$

Figure 2. The cumulative level spacing distribution P(s) obtained for $\langle k \rangle = 0.4$
(squares), $\langle k \rangle = 0.6$ (circles), $\langle k \rangle = 0.8$ (triangles), and $\langle k \rangle = 1$ (diamonds). (The
curves corresponding to $\langle k \rangle < 1$ were shifted vertically to give a clearer view.) In each
case, the empirical P(s) can be very well fitted by $P(\alpha, \alpha s)$ (continuous lines). The
inset shows the fitting parameter $\alpha$ as a function of the average degree $\langle k \rangle$.

The distribution given by (6a) is normalised, and fulfils (2) as well. The inset in Fig. 2
shows the relation between the fitting parameter $\alpha$ and the average degree, which can
be expressed simply as

$$ \alpha = 2 - \langle k \rangle. \tag{8} $$

At the critical point of the percolation transition $\alpha$ becomes unity, therefore P(s) given
by (6b) is transformed into $P_{\mathrm{P}}(s) = 1 - \exp(-s)$, corresponding to Poisson statistics in
RMT.
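
The fitted form below the critical point can be evaluated without special-function libraries. The following sketch is a minimal illustration that computes the regularised Gamma-function from its standard power series and combines it with the relation α = 2 - ⟨k⟩; the function names, the tolerance and the term limit are assumptions of this example, and the series is only intended for the moderate arguments relevant to level spacings.

```python
import math

def regularised_gamma_p(a, x, tol=1e-14, max_terms=10000):
    """Regularised Gamma-function P(a, x) for a > 0 via its power series.

    Uses gamma(a, x) = x**a * exp(-x) * sum_n x**n / (a (a+1) ... (a+n)).
    """
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    n = 0
    while abs(term) > tol * abs(total) and n < max_terms:
        n += 1
        term *= x / (a + n)
        total += term
    # Divide by Gamma(a) in log space to avoid overflow.
    return total * math.exp(a * math.log(x) - x - math.lgamma(a))

def dispersed_cdf(s, mean_degree):
    """Cumulative spacing distribution P(alpha, alpha * s) with alpha = 2 - <k>."""
    alpha = 2.0 - mean_degree
    return regularised_gamma_p(alpha, alpha * s)
```

At `mean_degree = 1.0` the function `dispersed_cdf` reduces to 1 - exp(-s), the Poisson limit stated above.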

In the $\langle k \rangle > 1$ regime the graph contains a giant component. Close to the critical
point, there are other smaller components present as well; however, for large enough $\langle k \rangle$
the size of the giant component eventually reaches the system size. The level spacing
distribution in the vicinity of $\langle k \rangle = 1$ can be fitted with the Brody-distribution, given
by (5a)-(5b), corresponding to statistics in between Poisson and GOE. In Fig. 3 we
demonstrate this effect by plotting $-\log[1 - P(s)]$ as a function of s on a double logarithmic
scale. A cumulative level spacing distribution of the form (5b) is thereby transformed
into $C s^{\alpha}$, appearing as a straight line with slope $\alpha$. At the critical point of the
percolation transition $\alpha = 1$, therefore the slope of the corresponding curve (open
circles) is unity. The slope of the curves increases with the average degree, and at
$\langle k \rangle = 2$ it is already close to $\alpha = 2$, corresponding to GOE statistics. In Fig. 4 the
fitting parameter $\alpha$ is shown as a function of $\langle k \rangle$, following a sigmoid curve, reaching
the $\alpha = 2$ limit closely above $\langle k \rangle = 2$.
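
The straight-line construction described above suggests a simple way to estimate the Brody exponent from an empirical cumulative distribution. The sketch below is a minimal illustration using an ordinary least-squares fit on the transformed data; the function name and the choice of an unweighted fit are assumptions, and the text does not state which fitting procedure was used.

```python
import math

def brody_fit_from_cdf(s_values, cdf_values):
    """Fit log(-log(1 - P)) = log(C) + alpha * log(s) by least squares.

    Returns the pair (alpha, C). Points with s <= 0 or P outside (0, 1)
    are skipped because the transformation is undefined there.
    """
    xs, ys = [], []
    for s, p in zip(s_values, cdf_values):
        if s > 0.0 and 0.0 < p < 1.0:
            xs.append(math.log(s))
            ys.append(math.log(-math.log(1.0 - p)))
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, math.exp(intercept)
```

Applied to values generated from 1 - exp(-C s^α), the fit returns the α and C used to generate them, up to rounding.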



Figure 3. The level spacings of the E-R graph follow the Brody-distribution in the
$\langle k \rangle > 1$ regime. By plotting $-\log[1 - P(s)]$ as a function of s on a double logarithmic
scale, the $P_{\mathrm{B}}(s)$ given by Eq. (5b) appears as a straight line with slope $\alpha$. In the
limiting case of $\langle k \rangle = 1$ (open circles) the level spacing distribution is Poissonian with
$-\log[1 - P(s)] = s$, corresponding to a straight line with unit slope. For $\langle k \rangle = 1.2$
(squares), $\langle k \rangle = 1.4$ (triangles up), $\langle k \rangle = 1.6$ (diamonds), $\langle k \rangle = 1.8$ (filled circles)
and $\langle k \rangle = 2$ (triangles down) we can observe intermediate level spacing distributions
between Poissonian and GOE, shown by straight lines with increasing slopes.



4. Spectral transition in real networks 

Similarly to the E-R graph, spectral transitions can occur in real networks as well. In 
our studies we examined the behaviour of the level spacing distribution of the Internet, 
a word association network, and a protein-protein interaction network. In case of the 
Internet each node corresponded to an Autonomous System, and the links between the 
Autonomous Systems were obtained from the DIMES project [32]. The word association 
network was constructed from the South Florida Free Association norms list [13], in 
which a link from one word to another indicates that people in the surveys associated 
the end point of the link with its start point. And finally, the studied protein-protein 
network contained the DIP core list of the protein-protein interactions of S. cerevisiae 
[33]. These networks are all scale-free, and they consist of 14161, 10617, and 2609 nodes 
and 43430, 63788, and 6355 links, respectively. In each case, the largest connected 
component contained more than 90% of the nodes, and the level spacing distribution 
followed the GOE statistics.

Figure 4. The fitting parameter $\alpha$ in the $\langle k \rangle > 1$ regime as a function of $\langle k \rangle$.
Starting from $\alpha = 1$ at $\langle k \rangle = 1$, as the level spacing distribution transforms from
Poissonian to GOE, the parameter $\alpha$ reaches $\alpha = 2$ following a sigmoid curve.



In order to obtain a percolation transition similar to that of the E-R graph, we 
applied the following recursion to all three networks until their giant components were 
destroyed: 

• calculate the node degrees, 

• remove the node with the largest degree. 

This algorithm is a variation of the method used to investigate the attack tolerance
of networks, where the nodes are removed in the order of their original degree [45].
Therefore, on the one hand, the node removal process above can also be viewed as a
simulation of intentional damage to the investigated networks. On the other hand,
the advantage of the present approach compared to the original process in [45] is that
we can generate several different configurations of the dispersed state: whenever the
largest degree is possessed by more than one node, the algorithm arrives at a branch
point with multiple choices for the next node removal.
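
The node removal recursion can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used for the results: the network is assumed to be an undirected graph stored as a dictionary of neighbour sets, the stopping rule (largest component at most a fraction `threshold` of the original node count) is an assumption introduced here because the text does not quantify when the giant component counts as destroyed, and ties at a branch point are resolved by a random choice.

```python
import random

def largest_component_size(adj):
    """Number of nodes in the largest connected component."""
    seen = set()
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        size = 0
        while stack:
            node = stack.pop()
            size += 1
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        best = max(best, size)
    return best

def degree_attack(adj, threshold=0.05, seed=None):
    """Repeatedly remove a node of largest current degree.

    Stops once the largest component holds at most `threshold` times the
    original number of nodes (a hypothetical criterion for this sketch).
    Returns the remaining graph and the removal order.
    """
    rng = random.Random(seed)
    adj = {node: set(nbs) for node, nbs in adj.items()}  # work on a copy
    n_original = len(adj)
    removed = []
    while adj and largest_component_size(adj) > threshold * n_original:
        # Recalculate the degrees on the current graph.
        max_degree = max(len(nbs) for nbs in adj.values())
        candidates = [v for v, nbs in adj.items() if len(nbs) == max_degree]
        target = rng.choice(candidates)  # branch point when there are ties
        for nb in adj.pop(target):
            if nb in adj:
                adj[nb].discard(target)
        removed.append(target)
    return adj, removed
```

Running `degree_attack` with different `seed` values yields different instances of the dispersed state whenever ties occur, which is what allows averaging over multiple configurations. Recomputing the largest component after every removal keeps the sketch short at the cost of speed.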

In Fig. 5a we plotted the observed level spacing distributions at three stages in
the node removal procedure, whereas Fig. 5b displays the accompanying cumulative
component size distributions P(n), where n denotes the number of nodes in the
components. The circles correspond to the Internet, the squares to the word association
network, and the triangles to the protein-protein interaction network. The white
symbols show the studied distributions in the original cohesive state of the networks:
P(n) is dominated by an outstandingly large cluster, the giant component, and p(s)
follows GOE statistics. By successively removing the nodes with the largest degree,
the size of the largest component decreases, and in the vicinity of the critical point
of the percolation transition P(n) is transformed into a power-law, and p(s) becomes
exponential, as shown by the gray symbols. (These points result from averaging over
multiple instances of the critical state, generated by the algorithm detailed above.)
By continuing the node deletion process, the networks fall apart into small disjoint
components, P(n) transforms into a truncated distribution, and p(s) becomes peaked
again, starting from p(s) = 0 at s = 0, as shown by the black symbols. (Again, the
points show the result of averaging over multiple instances of the dispersed state.) Even
though the p(s) curves for the three different networks do not coincide with each other
exactly in Fig. 5a, it is clear that they all undergo a similar transition to that of the E-R
graph.

Figure 5. a) The p(s) of the Internet (circles), the word association network (squares)
and the protein-protein interaction network (triangles) at three stages in the node
deletion process: the original cohesive state (white symbols), in the vicinity of the
critical point of the percolation transition (gray symbols) and in the dispersed state
(black symbols). b) The accompanying cumulative size distributions P(n) with n
denoting the number of nodes in the components.



5. Conclusions 



According to our investigations the percolation transition in networks is accompanied by 
a transition in the level spacing distribution as well. When a giant connected component 
containing the majority of nodes is present, p(s) follows the GOE statistics, whereas in 
the vicinity of the critical point of the percolation transition, p(s) becomes exponential. 
Dispersed networks consisting of many small, disjoint clusters have a p(s) starting
from p(s) = 0 at s = 0 with a peak close to s = 0, and for the E-R graph, the
corresponding cumulative level spacing distribution P(s) can be simply given by the
regularised Gamma-function $P(\alpha, \alpha s)$.